8 research outputs found

Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition

Author: De Meulder Fien
Sang Erik F. Tjong Kim
Publication date: 01/01/2003

We describe the CoNLL-2003 shared task: language-independent named entity recognition. We give background information on the data sets (English and German) and the evaluation method, present a general overview of the systems that have taken part in the task, and discuss their performance.

arXiv.org e-Print Archive

CiteSeerX

Tilburg University Repository

Combined optimization of feature selection and algorithm parameters in machine learning of language

Author: Daelemans Walter
De Meulder Fien
Hoste Veronique
Naudts Bart
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2003

Comparative machine learning experiments have become an important methodology in empirical approaches to natural language processing (i) to investigate which machine learning algorithms have the 'right bias' to solve specific natural language processing tasks, and (ii) to investigate which sources of information add to accuracy in a learning approach. Using automatic word sense disambiguation as an example task, we show that with the methodology currently used in comparative machine learning experiments, the results may often not be reliable because of the role of and interaction between feature selection and algorithm parameter optimization. We propose genetic algorithms as a practical approach to achieve both higher accuracy within a single approach, and more reliable comparisons.

CiteSeerX

Ghent University Academic Bibliography

A named entity recognition system for Dutch

Author: Daelemans Walter
De Meulder Fien
Hoste Veronique
Publication venue: Rodopi
Publication date: 01/01/2002

We describe a Named Entity Recognition system for Dutch that combines gazetteers, hand-crafted rules, and machine learning on the basis of seed material. We used gazetteers and a corpus to construct training material for Ripper, a rule learner. Instead of using Ripper to train a complete system, we used many different runs of Ripper in order to derive rules which we then interpreted and implemented in our own, hand-crafted system. This speeded up the building of a hand-crafted system, and allowed us to use many different rule sets in order to improve performance. We discuss the advantages of using machine learning software as a tool in knowledge acquisition, and evaluate the resulting system for Dutch.

Ghent University Academic Bibliography

Institutional Repository Universiteit Antwerpen

Tilburg University Repository

Memory-Based Named Entity Recognition Using Unannotated Data

Author: Fien De Meulder
Walter Daelemans
Publication date: 01/01/2003

We used the memory-based learner Timbl (Daelemans et al., 2002) to find names in English and German newspaper text. A first system used only the training data, and a number of gazetteers. The results show that gazetteers are not beneficial in the English case, while they are for the German data. Type-token generalization was applied, but also reduced performance.

CiteSeerX

Crossref

Institutional Repository Universiteit Antwerpen

Tilburg University Repository

A named entity recognition system for Dutch

Author: Daelemans Walter
de Meulder Fien
Hoste Véronique
Publication date: 01/01/2002

Institutional Repository Universiteit Antwerpen

Combined optimization of feature selection and algorithm parameter interaction in machine learning of language

Author: Daelemans Walter
de Meulder Fien
Hoste Véronique
Naudts Bart
Publication date: 01/01/2003

Institutional Repository Universiteit Antwerpen

Tilburg University Repository

Diversity Checker: Toward recommendations for improving journalism with respect to diversity

Author: Diakopoulos Nicholas
Gangemi Aldo
Humprecht Edda
Kuang Sicong
Lazaridou K.
Ling Xiao
Masini Andrea
Meulder Fien De
Pedregosa Fabian
Sekine Satoshi
van den Bosch Antal
Škrbec Jasna
Štajner Tadej
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 02/07/2018

The Diversity Checker is a tool that aims to make it easier for journalists to author their texts with diversity in mind. To provide helpful hints for them in this respect, it is necessary to define how to quantify diversity so that this can be programmed into the tool. At this early stage in the development of the tool, we present a two-fold contribution. First, we offer an analysis of what we mean by "improving diversity". Second, we present the first version of the Diversity Checker, along with some analysis of its current performance.

Lirias

Crossref